Towards the Identification of Keywords in the Web Site Text Content: A Methodological Approach
Abstract
Since the creation of the web, designers have looked for friendlier ways of building web page content, in which pictures, sounds, movies and free text attract the users' interest. The text content receives special attention because it is the parameter most frequently used to retrieve information from the web. A simple way to understand a user's text preferences would be to collect the words used in searches. However, this information is known only to the owner of the specific search engine. In this paper we introduce a methodology for extracting the words of greatest interest to a user in a particular web site, based on the user's browsing behavior and the web page text content. The methodology was tested using data originating from a bank web site, showing the effectiveness of our approach.
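The abstract does not say how browsing behavior and text content are combined, so the following is only an illustrative sketch and not the authors' method. It assumes that each page is represented by TF-IDF word weights and that the share of a session's time spent on a page stands in for the user's interest in it. The function names (`tokenize`, `tfidf_vectors`, `rank_keywords`) and the session format are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(pages):
    """Map each page id to a {word: tf-idf weight} dictionary.

    `pages` is a dict of page id -> page text (assumed input format).
    """
    tokens = {pid: tokenize(text) for pid, text in pages.items()}
    n_pages = len(pages)

    # Document frequency: in how many pages does each word appear?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    vectors = {}
    for pid, words in tokens.items():
        counts = Counter(words)
        total = len(words) or 1  # avoid division by zero on empty pages
        vectors[pid] = {
            w: (c / total) * math.log(n_pages / doc_freq[w])
            for w, c in counts.items()
        }
    return vectors


def rank_keywords(pages, sessions, top_k=5):
    """Rank words by TF-IDF weight scaled by time spent on each page.

    `sessions` is a list of sessions; each session is a list of
    (page_id, seconds_spent) pairs (assumed input format).
    """
    vectors = tfidf_vectors(pages)
    scores = Counter()
    for session in sessions:
        session_time = sum(seconds for _, seconds in session) or 1
        for pid, seconds in session:
            share = seconds / session_time  # assumed interest signal
            for word, weight in vectors.get(pid, {}).items():
                scores[word] += share * weight
    return [word for word, _ in scores.most_common(top_k)]
```

Given a dictionary of page texts and a list of sessions, `rank_keywords(pages, sessions)` returns the words whose pages attracted the most viewing time, with words that appear on every page weighted down to zero by the logarithmic term.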
Similar resources
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Data Extraction using Content-Based Handles
In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we are inspired by the way a human user scans the page content for specific data. In particular, we use text fea...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher such a massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Image retrieval using the combination of text-based and content-based algorithms
Image retrieval is an important research field which has received great attention in the last decades. In this paper, we present an approach for the image retrieval based on the combination of text-based and content-based features. For text-based features, keywords and for content-based features, color and texture features have been used. Query in this system contains some keywords and an input...
Methodological Criticism of Misbah ul-Hidaya and Miftah ul-Kifaya
“Misbah ul-Hidaya and Miftah ul-Kifaya” is one of the most famous works of Izzuddin Mahmood Bin Ali Kashani, the sixth century (AH) mystic. The author of this book first describes the religious beliefs and then elaborates on mystical customs and Sufism. References to Persian and Arabic poetry, and using verses of the Quran, Hadiths, and employing rhetorical figures are among the stylistic featu...
Journal title:
- IJWIS
Volume 1, Issue
Pages -
Publication date 2005